Bayesian Finite Population Imputation for Data Fusion

Author

  • Jerome P. Reiter
Abstract

In data fusion, data owners seek to combine datasets with disjoint observations and distinct variables to estimate relationships among the variables. One approach is to concatenate the files, specify models relating the variables not jointly observed, and use the models to generate multiple imputations of the missing data. We show that the standard multiple imputation estimator of the sampling variance can have positive bias in such contexts. We present an approach for correcting this problem based on Bayesian finite population inference. We also present an approach for data fusion when some values are confidential and cannot be shared.
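
For reference, the "standard multiple imputation estimator of the sampling variance" mentioned in the abstract is usually computed with Rubin's combining rules: the average within-imputation variance plus an inflated between-imputation variance. The sketch below shows only that standard estimator, not the finite population correction the paper proposes. The function name `rubin_combine` and the input numbers are hypothetical and chosen purely for illustration.

```python
from statistics import mean, variance


def rubin_combine(point_estimates, within_variances):
    """Combine m completed-data analyses using Rubin's rules.

    point_estimates:  one estimate per imputed dataset
    within_variances: the estimated sampling variance of each estimate
    """
    m = len(point_estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need m >= 2 estimates with matching variances")
    q_bar = mean(point_estimates)      # combined point estimate
    u_bar = mean(within_variances)     # average within-imputation variance
    b = variance(point_estimates)      # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b        # standard MI variance estimate
    return q_bar, u_bar, b, t


# Hypothetical inputs from m = 3 imputed datasets (illustration only).
q_bar, u_bar, b, t = rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(q_bar, u_bar, b, t)
```

According to the abstract, this quantity `t` is the one that can be positively biased in data fusion settings, which is what motivates the paper's alternative approach.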

Similar resources

Imputation of parent-offspring trios and their effect on accuracy of genomic prediction using Bayesian method

The objective of this study was to evaluate the imputation accuracy of parent-offspring trios under different scenarios. Using simulated datasets, the performance of Bayesian LASSO in genomic prediction was also examined. The genome consisted of 5 chromosomes, each 1 Morgan in length. The number of SNPs per chromosome was 10000. One hundred QTLs were randomly distributed a...

A comparison of imputation methods for handling missing scores in biometric fusion

Multibiometric systems, which consolidate or fuse multiple sources of biometric information, typically provide better recognition performance than unimodal systems. While fusion can be accomplished at various levels in a multibiometric system, score-level fusion is commonly used as it offers a good tradeoff between data availability and ease of fusion. Most score-level fusion rules assume that ...

Effect of Reference Population Size and Imputation Methods on the Accuracy of Imputation in Pure and Mixed Populations

Imputation, a method for converting low-density chip genotypes to high-density ones, has been introduced to increase the accuracy of genomic selection in animals. In the current study, to investigate imputation accuracy, three populations, mixed (scenario 1), pure (scenario 2) and mixed + pure (scenario 3), were simulated using QMSim. Two imputation methods, Beagle and FImpute, were used fo...

A decision theoretic approach to Imputation in finite population sampling

SUMMARY Consider the situation where observations are missing at random from a simple random sample drawn from a finite population. In certain cases it is of interest to create a full set of sample values such that inferences based on the full set will have the stated frequentist properties even though the statistician making those inferences is unaware that some of the observations were missin...

Bayesian Data Fusion: a Reliable Approach for Descriptive Modeling of Ore Deposits

Recognition of ore deposit genesis is still a controversial challenge for economic geologists. Here, this task was addressed by means of Bayesian data fusion (BDF) using the available evidence: semi-schematic examples with two (Cu and Pb + Zn) and three (Cu, Pb + Zn and Ag) lines of evidence. The data, which in the current paper are just concentrations of the indicated elements, were collected from Angouran’s ...

Publication date: 2011